Deterministically annealed design of speech recognizers and its performance on isolated letters
Abstract
We attack the general problem of HMM-based speech recognizer design, and in particular, the problem of isolated letter recognition in the presence of background noise. The standard design method based on maximum likelihood (ML) is known to perform poorly when applied to isolated letter recognition. The more recent minimum classification error (MCE) approach directly targets the ultimate design criterion and offers substantial improvements over the ML method. However, the standard MCE method relies on gradient descent optimization, which is susceptible to becoming trapped in shallow local minima. In this paper, we propose to overcome this difficulty with a powerful optimization method based on deterministic annealing (DA). The DA method minimizes a randomized MCE cost subject to a constraint on the level of entropy, which is gradually relaxed. It may be derived from information-theoretic or statistical physics principles. DA has a low implementation complexity and outperforms both standard ML and the gradient-descent-based MCE algorithm by a factor of 1.5 to 2.0 on the benchmark CSLU spoken letter database. Further, the gains are maintained under a variety of background noise conditions.
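The abstract does not give formulas, so the following is only a minimal sketch of what a "randomized MCE cost subject to an entropy constraint" can look like. It assumes the Gibbs (softmax) decision rule that is common in deterministic annealing, with a hypothetical sharpness parameter `gamma` and temperature `temperature`. The per-class scores stand in for HMM log-likelihoods, and all function names and data are illustrative, not taken from the paper. The sketch only evaluates the cost; it does not optimize any HMM parameters.

```python
import math

def gibbs_probs(scores, gamma):
    """Randomized decision rule (assumed form): P[j] proportional to exp(gamma * score_j)."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(gamma * (s - m)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def expected_error_and_entropy(samples, gamma):
    """Average probability of error and average decision entropy.

    samples: list of (scores, true_label) pairs, one score per class.
    """
    err = 0.0
    ent = 0.0
    for scores, label in samples:
        p = gibbs_probs(scores, gamma)
        err += 1.0 - p[label]                               # chance of picking a wrong class
        ent -= sum(q * math.log(q) for q in p if q > 0.0)   # Shannon entropy in nats
    n = len(samples)
    return err / n, ent / n

def lagrangian(samples, gamma, temperature):
    """Entropy-constrained cost in Lagrangian form: expected error minus T times entropy."""
    err, ent = expected_error_and_entropy(samples, gamma)
    return err - temperature * ent

# Toy data (hypothetical): three samples, three classes; the second sample is
# misclassified by a hard argmax rule because class 1 scores higher than class 0.
samples = [
    ([-1.0, -2.5, -3.0], 0),
    ([-2.0, -1.8, -4.0], 0),
    ([-3.0, -0.5, -2.0], 1),
]

if __name__ == "__main__":
    for gamma in (0.0, 1.0, 10.0, 100.0):
        err, ent = expected_error_and_entropy(samples, gamma)
        print(f"gamma={gamma:6.1f}  expected error={err:.4f}  entropy={ent:.4f}")
```

In this toy setting, `gamma = 0` gives a uniform random decision (expected error 2/3 with three classes), and as `gamma` grows the entropy shrinks and the expected error approaches the hard-decision error rate of 1/3. An annealing schedule would lower `temperature` gradually while re-optimizing the model parameters at each step.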
Similar resources
Deterministically annealed design of hidden Markov model speech recognizers
Many conventional speech recognition systems are based on the use of hidden Markov models (HMM) within the context of discriminant-based pattern classification. While the speech recognition objective is a low rate of misclassification, HMM design has been traditionally approached via maximum likelihood (ML) modeling which is, in general, mismatched with the minimum error objective and hence sub...
Statistical properties of infant-directed versus adult-directed speech: insights from speech recognition.
Previous studies have shown that infant-directed speech ('motherese') exhibits overemphasized acoustic properties which may facilitate the acquisition of phonetic categories by infant learners. It has been suggested that the use of infant-directed data for training automatic speech recognition systems might also enhance the automatic learning and discrimination of phonetic categories. This stud...
Discriminative feature weighting for HMM-based continuous speech recognizers
The Discriminative Feature Extraction (DFE) method provides an appropriate formalism for the design of the front-end feature extraction module in pattern classification systems. In recent years, this formalism has been successfully applied to different speech recognition problems, such as classification of vowels, classification of phonemes, or isolated word recognition. The DFE formalism can be...
Combining forward-based and backward-based decoders for improved speech recognition performance
Combining outputs of speech recognizers is a known way of increasing speech recognition performance. The ROVER approach efficiently handles such combinations. In this paper we show that the best performance is not achieved by combining the outputs of the best set of recognizers, but rather by combining outputs of recognizers that rely on different processing components, and in particular on a d...
Designing and implementing a system for automatic recognition of Persian letters by lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among these, lip-reading techniques have encountered many challenges for speech recognition, one of the challenges b...